This notebook outlines my process for fitting tree-based and SVM models. It depends on the data table gameInfo generated by DataExtraction.RMD.
Loading Data from the Extraction Step
load("../data/league.RDATA")
Packages
library(tidyverse)    # dplyr, tidyr, stringr, and ggplot2 are used below
library(randomForest)
library(plotly)
Warning: package ‘plotly’ was built under R version 4.1.2
Registered S3 method overwritten by 'htmlwidgets':
method from
print.htmlwidget tools:rstudio
Attaching package: ‘plotly’
The following object is masked from ‘package:ggplot2’:
last_plot
The following object is masked from ‘package:stats’:
filter
The following object is masked from ‘package:graphics’:
layout
List to Store Results
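The chunks below read from and write to `data.tree$temp.data`, `data.tree$models`, and `championCluster$temp.data`, so the results containers are presumably nested lists. A minimal sketch of such a layout follows. The `plots` slot is an assumption, since nothing in the code shown here uses it.

```r
# Assumed layout: one named list per analysis, with a slot per kind of artifact.
data.tree <- list(
  models = list(),     # fitted models, e.g. the random forest
  plots = list(),      # assumed slot for figures
  temp.data = list()   # intermediate tables such as the train/test splits
)
championCluster <- list(
  models = list(),
  plots = list(),
  temp.data = list()
)
```

Keeping everything in two lists keeps the global environment tidy and makes it easy to save all results in one go.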
Wrangling Data
I want to build a basic tree classifier that predicts the winning team from the two team comps. For now, the model uses only simple features: the number of champions with each tag on each team.
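To make the feature layout concrete, here is a rough sketch of how per-team tag counts could be built with dplyr and tidyr. The `toy` table and its values are made up for illustration. Only the `Tag_team` column naming (for example `Mage_1`) is taken from the forest output further down.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for gameInfo joined with champion tags (all values are made up).
toy <- tibble(
  match = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2),
  team  = c(1, 1, 1, 2, 2, 2, 1, 1, 1, 2, 2, 2),
  tag   = c("Mage", "Tank", "Mage", "Fighter", "Support", "Tank",
            "Fighter", "Fighter", "Support", "Mage", "Mage", "Mage")
)

# Count tags per team, then spread to one row per match with _1/_2 suffixes.
# Tags a team does not have are filled with 0 instead of NA.
tag_counts <- toy %>%
  count(match, team, tag) %>%
  pivot_wider(
    names_from = c(tag, team),
    values_from = n,
    values_fill = 0
  )

tag_counts
```

Each row is then one match, with a column such as `Mage_1` holding the number of mages on team 1.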
Setting up Training / Test Data
# Setting Seed for Reproducibility
set.seed(3)
data.tree$temp.data$sample <- sample(
  data.tree$temp.data$gameInfo.tree$match,
  nrow(data.tree$temp.data$gameInfo.tree) * .7
)
data.tree$temp.data$train <- data.tree$temp.data$gameInfo.tree %>%
  filter(match %in% data.tree$temp.data$sample)
data.tree$temp.data$test <- data.tree$temp.data$gameInfo.tree %>%
  filter(!match %in% data.tree$temp.data$sample)
Generating Random Forest
set.seed(3)
data.tree$models$teamComp_forest <- randomForest(
  team_win ~ . - match,
  data = data.tree$temp.data$train,
  ntree = 500,
  importance = TRUE,
  na.action = na.omit
)
data.tree$models$teamComp_forest
Call:
randomForest(formula = team_win ~ . - match, data = data.tree$temp.data$train, ntree = 500, importance = TRUE, na.action = na.omit)
Type of random forest: classification
Number of trees: 500
No. of variables tried at each split: 3
OOB estimate of error rate: 50.08%
Confusion matrix:
      1     2 class.error
1 11396 12530   0.5236981
2 11377 12438   0.4777241
importance(data.tree$models$teamComp_forest)
                    1          2 MeanDecreaseAccuracy MeanDecreaseGini
Assassin_1  9.9540136  -6.814106            5.1561345         332.2679
Fighter_1  17.2205799 -13.858278            5.5311721         346.1510
Marksman_1  9.5225581  -8.116976            1.7123134         293.4874
Tank_1      7.4799838  -4.724412            3.9559451         285.2230
Mage_1     16.9019553 -14.924233            2.9464378         284.1115
Support_1  12.2424356 -11.952428            1.8623555         242.4686
Assassin_2  0.3371864  -2.088303           -2.3073435         359.3653
Fighter_2   2.4827423  -4.897604           -2.7091299         425.1212
Marksman_2  3.2506393  -4.942703           -1.7623174         369.9343
Tank_2      3.0022087  -5.144555           -2.4517398         338.7178
Mage_2      1.0833946  -1.633843           -0.6288478         389.8527
Support_2   0.6855808  -5.485852           -5.9403592         288.7111
varImpPlot(data.tree$models$teamComp_forest)
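The test split created earlier is not used in the cells shown here, so a held-out check may be worth adding. The following is only a sketch on synthetic data, not the notebook's own evaluation. The names `toy`, `toy_train`, `toy_test`, and `toy_forest` are hypothetical stand-ins for the real tables and model.

```r
library(randomForest)

set.seed(3)
# Synthetic stand-in for the train/test tables (random features, random labels).
n <- 200
toy <- data.frame(
  team_win = factor(sample(1:2, n, replace = TRUE), levels = c(1, 2)),
  Mage_1 = sample(0:3, n, replace = TRUE),
  Mage_2 = sample(0:3, n, replace = TRUE)
)
train_idx <- sample(n, floor(n * 0.7))
toy_train <- toy[train_idx, ]
toy_test <- toy[-train_idx, ]

toy_forest <- randomForest(team_win ~ ., data = toy_train, ntree = 100)

# Predict the held-out rows and compare with the true labels.
pred <- predict(toy_forest, newdata = toy_test)
test_accuracy <- mean(pred == toy_test$team_win)

table(predicted = pred, actual = toy_test$team_win)
test_accuracy
```

With the real data, the same two lines (`predict()` on the test table, then the mean of matches) would give an accuracy to set next to the OOB estimate.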

Let’s compare this to a naive classifier that always predicts a blue-side win:
data.tree$temp.data$gameInfo.tree %>%
  count(team_win) %>%
  mutate(n = n/sum(n))
Well, at best it’s on par with the naive blue side win classifier: an OOB error rate of 50.08% is a coin flip, and in the training data team 1 alone wins 23,926 of 47,741 matches (about 50.1%). Clearly the number of champions with each tag isn’t a strong predictor of team success. With the current encoding, I’m fairly certain that no classifier will be robust.
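The comparison can be checked directly from the confusion matrix printed by the forest. This sketch only redoes the arithmetic on those four numbers. Note that the baseline here uses the training rows behind the OOB estimate, not the full table counted above.

```r
# Confusion matrix as printed by the forest (rows = actual, columns = predicted).
conf <- matrix(
  c(11396, 12530,
    11377, 12438),
  nrow = 2, byrow = TRUE,
  dimnames = list(actual = c("1", "2"), predicted = c("1", "2"))
)

# Share of correct out-of-bag predictions.
oob_accuracy <- sum(diag(conf)) / sum(conf)

# Accuracy of always predicting the more common winner.
majority_baseline <- max(rowSums(conf)) / sum(conf)

round(c(oob_accuracy = oob_accuracy, majority_baseline = majority_baseline), 4)
```

The OOB accuracy comes out at about 0.4992 against about 0.5012 for the majority baseline, so the forest is not ahead of the naive rule on these rows.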
Let’s try to identify clusters of champion types.
Generating Input Team Sentences
championCluster$temp.data$teams <- gameInfo %>%
  select(match, win, championName) %>%
  group_by(match, win) %>%
  mutate(championNumber = row_number()) %>%
  pivot_wider(
    names_from = championNumber,
    values_from = championName
  ) %>%
  transmute(match = match, win = win, team = str_c(`1`,`2`,`3`,`4`,`5`, sep = " ")) %>%
  ungroup() %>%
  select(team)
championCluster$temp.data$teams
Generating Model
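Treating each team as a "sentence" of five champion names suggests a word-embedding model. As a rough sketch, assuming the `word2vec` package and a skip-gram model, the embedding step could look like the following. The toy teams are made up, the hyperparameters are illustrative, and `min_count = 1` is set only so the tiny corpus keeps every champion.

```r
library(word2vec)

set.seed(3)
# Toy team "sentences": five champion names separated by spaces (made-up data).
teams <- c(
  "Garen LeeSin Ahri Jinx Thresh",
  "Darius Vi Zed Caitlyn Lulu",
  "Garen Vi Ahri Caitlyn Thresh",
  "Darius LeeSin Zed Jinx Lulu"
)
teams <- rep(teams, 50)

# Skip-gram embedding: champions that appear in similar teams get similar vectors.
model <- word2vec(
  x = teams,
  type = "skip-gram",
  dim = 20,
  iter = 15,
  min_count = 1
)

# One row per vocabulary entry, one column per embedding dimension.
embedding <- as.matrix(model)
dim(embedding)

# Champions closest to a given one in the embedding space.
predict(model, newdata = "Garen", type = "nearest", top_n = 3)
```

The rows of `embedding` can then be projected to two dimensions for plotting, which is where cluster structure would become visible.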
The champions fall pretty clearly into 5 main clusters, each corresponding to a role. That doesn’t help much in determining team compositions, since it mostly recovers the role structure every team already has. I could set up a KNN to verify the clusters, but they seem pretty clear-cut to me.
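If I did want to verify the clusters, a leave-one-out KNN check is one way to do it: label each champion with a role and see how often its nearest neighbours in the embedding agree. This sketch uses a synthetic 2-D embedding with five well-separated groups. The role names, `embedding`, and `labels` are all assumptions standing in for the real embedding matrix and role labels.

```r
library(class)

set.seed(3)
# Synthetic 2-D "embedding": five well-separated groups standing in for roles.
roles <- c("Top", "Jungle", "Mid", "Bottom", "Support")
centers <- cbind(x = c(0, 10, 20, 30, 40), y = c(0, 10, 0, 10, 0))
n_per_role <- 20

embedding <- centers[rep(seq_along(roles), each = n_per_role), ] +
  matrix(rnorm(2 * length(roles) * n_per_role, sd = 1), ncol = 2)
labels <- factor(rep(roles, each = n_per_role), levels = roles)

# Leave-one-out KNN: each point is classified by its 5 nearest neighbours,
# excluding the point itself.
loo_pred <- knn.cv(train = embedding, cl = labels, k = 5)
loo_accuracy <- mean(loo_pred == labels)

table(predicted = loo_pred, actual = labels)
loo_accuracy
```

A leave-one-out accuracy close to 1 on the real embedding would back up the claim that the clusters line up with roles.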